A RE-TARGETABLE COMMUNICATION SYSTEM

FIELD OF THE INVENTION

This invention relates to communications technologies generally and 
particularly to a re-targetable communication system. 
BACKGROUND OF THE INVENTION

Many existing communication apparatus designs utilize fixed function
hardware accelerator(s), digital signal processing (hereinafter DSP) cores or a
combination of the two to carry out functions that are specified by various
communications standards. Some examples of these communications standards are
for digital subscriber lines, cable modems, integrated services digital network, T-1
lines, wireless communications, analog and digital modems, etc. Because
communications standards tend to evolve over time, system designers and architects
often favor designs that are sufficiently flexible to adapt to such evolution.

Unlike their fixed function hardware counterparts, DSP cores often provide the
requisite flexibility and the processing capabilities to support functions of one
communications standard. However, DSP cores are relatively expensive and have
relatively sizable physical dimensions. Furthermore, designs that attempt to utilize
DSP cores alone typically fail to handle multiple communications standards,
especially the standards for high-speed communications, in a cost-effective manner.

An alternative prior art approach is to utilize fixed function hardware, such as
Application Specific Integrated Circuits (hereinafter ASICs), in combination with
DSP cores. In particular, the approach dedicates the ASICs to execute certain
operations in order to alleviate any resource constraints that the DSP cores may
encounter. However, ASICs lack the flexibility of a programmable device. Thus,
this approach is likely to work cost effectively only for a fixed number and set of
communications standards. In other words, a system resulting from the approach is
neither capable of effectively adjusting to changes in its set of communications
standards, nor is the system scaleable to efficiently accommodate a varying number
of communications standards.

Therefore, in order to further improve the price/performance of
communication gear, an apparatus and a design approach are needed to provide a
flexible, programmable and highly scaleable solution for such gear to handle
multiple communications standards in a cost effective manner.



BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and is not limited by
the figures of the accompanying drawings, in which like references indicate similar
elements, and in which:

Figure 1 illustrates a block diagram of one embodiment of the present invention, a 
re-targetable communication system. 

Figure 2 illustrates a general block diagram of one embodiment of a scaleable
function unit.

Figure 3 illustrates a block diagram of a general-purpose computer system, which
includes one embodiment of a re-targetable communication system.

Figure 4 illustrates a block diagram of one embodiment of a complex arithmetic
element.

Figure 5(a) illustrates a block diagram of one embodiment of an arithmetic unit.

Figure 5(b) illustrates a block diagram of one embodiment of a
Multiplier/Accumulator engine.

Figure 6 illustrates a block diagram of one embodiment of a data router.



DETAILED DESCRIPTION

A re-targetable communication system is disclosed. In the following
description, numerous specific details are set forth in order to provide a thorough
understanding of the present invention. However, it will be apparent to one of
ordinary skill in the art that the invention may be practiced without these particular
details. In other instances, well known elements and theories have not been
described in special detail in order to avoid obscuring the present invention.

Figure 1 illustrates a block diagram of one embodiment of the present
invention, re-targetable communication system 100. Specifically, one
implementation of re-targetable communication system 100 involves a single
integrated circuit (hereinafter IC) device and mainly includes connectivity unit 102,
digital signal processing (hereinafter DSP) core 104 and a number of scaleable
functional units (hereinafter SFU), such as SFU 106. This single-IC embodiment of
re-targetable communication system 100 is also referred to as a re-targetable
communication processor in the subsequent discussions.

Connectivity unit 102 is designed to generically operate with any number and
types of plug-in modules. Thus, adding or removing a plug-in module would not
involve a re-design of connectivity unit 102. In addition to the mentioned DSP core
104 and a number of the SFUs, some examples of the plug-in modules can be, but
are not limited to, memory 108, media access control processor 110, analog-to-digital
converter 112, additional DSP cores, microcontroller cores, etc. DSP core 104, on
the other hand, broadly refers to a programmable computational unit that performs
the mathematics involved in digital signal processing algorithms.

One embodiment of connectivity unit 102 further includes internal system bus
114, digital input/output interface 116 and external bus interface 118. Digital
input/output interface 116 allows communication system 100 to handle parallel
input/output, interrupt requests, direct memory access, reset events, etc. On the other
hand, external bus interface 118 allows communication system 100 to communicate
with other processor(s) 120 including other re-targetable communications processors,
which may or may not physically reside in the same system or apparatus that
re-targetable communication system 100 is in. Lastly, internal system bus 114 provides
a common path for the plug-in modules and the various interfaces to communicate
among one another.

Figure 2 illustrates a general block diagram of one embodiment of SFU 106.
For illustration purposes, the following discussions assume that this embodiment
mainly operates as a numeric accelerator that has been optimized to execute digital
signal processing algorithms. It should however be noted that SFU 106 could apply
to other types of operations, such as forward error correction operations.
Additionally, although this disclosure mainly describes re-targetable communication
system 100 with a single SFU, the present invention is capable of supporting as many
SFUs as its design and cost parameters permit.

SFU 106 includes a number of removable complex arithmetic elements
(hereinafter CAEs) that are optimized for mathematically intense operations, such
as, but not limited to, Fast Fourier Transforms (hereinafter FFT), Least-Mean-Square
(hereinafter LMS) adaptive filters, LMS echo cancellations, LMS adaptive
equalizers, Finite Impulse Response (hereinafter FIR) filters, convolution,
interpolation, decimation, tuners, resamplers, etc. SFU 106 also has an inter-CAE
bus controller 200 and local memory 206.

Inter-CAE bus controller 200 not only bridges communications between SFU
106 and internal system bus 114 of connectivity unit 102, but it also regulates data
traffic on inter-CAE bus 202. Each CAE has west port 218 and east port 220 that
allow direct communications with its neighboring CAEs. For example, CAE 208 has
direct connections with its west neighboring CAE, or CAE 204, and its east neighboring
CAE, or CAE 210. The direct connections between CAEs help ease some traffic on
inter-CAE bus 202. Aside from communicating with its neighboring CAEs, each
CAE also can communicate with its non-neighboring CAEs via inter-CAE port 222
and inter-CAE bus 202. In addition, all CAEs have access to local memory 206,
which often contains lookup tables for information such as, but not limited to, sine
and cosine values, magnitude and phase angle, symbol decisions, etc. Because each
individual CAE has a certain amount of processing capability and the CAEs in SFU
106 operate in parallel, the overall processing capability of SFU 106 is directly
related to the number of CAEs in SFU 106. In other words, SFU 106 is readily
scaleable by varying the number of CAEs that it has.
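
By way of illustration only, the choice between a CAE's direct ports and the
inter-CAE bus may be sketched as follows. The west-to-east numbering of the CAEs
and the function name `select_path` are assumptions made for this sketch and are
not part of the disclosure.

```python
def select_path(src: int, dst: int, num_caes: int) -> str:
    """Pick the port a CAE at index src could use to reach the CAE at index dst.

    Assumption for this sketch: CAEs are numbered 0..num_caes-1 from west to east.
    """
    if not (0 <= src < num_caes and 0 <= dst < num_caes):
        raise ValueError("CAE index out of range")
    if dst == src:
        return "local"                 # no transfer between CAEs is needed
    if dst == src - 1:
        return "west port 218"         # direct link to the west neighbor
    if dst == src + 1:
        return "east port 220"         # direct link to the east neighbor
    return "inter-CAE port 222"        # non-neighbors are reached over bus 202
```

In this sketch only transfers between non-neighboring CAEs reach the shared bus,
which is consistent with the statement that the direct connections ease bus traffic.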

Operation of One Embodiment of a Complex Arithmetic Element
in One Embodiment of a Re-Targetable Communication System
Figure 4 illustrates a block diagram of one embodiment of a CAE, such as 
CAE 204 as shown in Figure 2. Specifically, CAE 204 includes sequencer 400, CAE 
memory 402, arithmetic unit 404 and data router 406. Sequencer 400 is responsible 
for generating addresses 406 for CAE memory 402 and for issuing control 
information 408 to arithmetic unit 404. Data router 406 is responsible for providing
CAE 204 connections to both its neighboring and non-neighboring CAEs and for
routing appropriate data to sequencer 400 and CAE memory 402. CAE memory 402
provides temporary data storage for arithmetic unit 404.

In response to control information 408 from sequencer 400, arithmetic unit
404 proceeds to execute certain targeted operations on data stored in CAE memory
402. In one embodiment, arithmetic unit 404 operations span several clock cycles.
Control information 408 also similarly spans several clock cycles to match arithmetic
unit 404. The subsequent paragraphs use one type of digital signal processing
operation, the LMS adaptive filter, to describe one optimized implementation of
arithmetic unit 404. The LMS adaptive filter generally follows the steps set forth
below:

1) performing a dot product between the input data to the filter and the filter 
coefficients; 

2) calculating the error between the output of the filter and a desired output 
response of the filter; 

3) adjusting the filter coefficients in response to the calculated error; and 

4) continuously repeating steps 1 - 3 until the calculated error drops to an
acceptable level.
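
The four steps above may be sketched in software as follows. This is a minimal,
real-valued floating-point illustration and not the disclosed hardware
implementation; the function name `lms_adapt`, the step size `mu`, the tolerance
`tol` and the `settle` count used to decide that the error is acceptable are all
assumptions made for the sketch.

```python
import random


def lms_adapt(inputs, desired, num_taps, mu, tol=1e-6, settle=8):
    """Real-valued LMS adaptation; returns (coefficients, samples_used)."""
    coeffs = [0.0] * num_taps
    window = [0.0] * num_taps      # most recent input sample first
    quiet = 0                      # consecutive samples with an acceptable error
    used = 0
    for x, d in zip(inputs, desired):
        used += 1
        window = [x] + window[:-1]
        # Step 1: dot product between the input data and the filter coefficients.
        y = sum(c * v for c, v in zip(coeffs, window))
        # Step 2: error between the filter output and the desired response.
        err = d - y
        # Step 3: adjust the coefficients in response to the error.
        coeffs = [c + mu * err * v for c, v in zip(coeffs, window)]
        # Step 4: repeat until the error has stayed small for `settle` samples.
        quiet = quiet + 1 if abs(err) < tol else 0
        if quiet >= settle:
            break
    return coeffs, used
```

As a usage example, feeding the sketch a pseudo-random input together with the
output of a known two-tap filter (coefficients 0.5 and -0.25, chosen arbitrarily)
should drive the adapted coefficients toward those two values.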

Moreover, for optimal performance of this embodiment of arithmetic unit 
404, CAE memory 402 includes two banks of separately addressable 64-bit wide data 
memories. The data memories may store 32-bit complex numbers (16-bit real and 
16-bit imaginary), 64-bit long complex numbers (32-bit real and 32-bit imaginary), 
16-bit real numbers, 32-bit long real numbers and 64-bit very long real numbers. 
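
As an illustration of how two 32-bit complex numbers could occupy one 64-bit
memory word, consider the sketch below. The bit layout (real part in the upper
half of each 32-bit field, first number in the upper half of the word) and the
function names are assumptions; the disclosure does not specify a layout.

```python
def pack_complex32(real: int, imag: int) -> int:
    """Pack signed 16-bit real and imaginary parts into one 32-bit field."""
    for part in (real, imag):
        if not -32768 <= part <= 32767:
            raise ValueError("component does not fit in 16 bits")
    # Assumed layout: real part in bits 31..16, imaginary part in bits 15..0.
    return ((real & 0xFFFF) << 16) | (imag & 0xFFFF)


def unpack_complex32(field: int) -> tuple:
    """Recover the signed (real, imag) pair from a 32-bit field."""
    def to_signed16(value: int) -> int:
        return value - 0x10000 if value & 0x8000 else value
    return to_signed16((field >> 16) & 0xFFFF), to_signed16(field & 0xFFFF)


def pack_word64(first: tuple, second: tuple) -> int:
    """Place two 32-bit complex numbers side by side in a 64-bit word."""
    return (pack_complex32(*first) << 32) | pack_complex32(*second)
```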

Figure 5(a) illustrates one such embodiment of arithmetic unit 404. 
Specifically, the embodiment includes register file 500 and four multiplier- 
accumulator (hereinafter MAC) engines, 502, 504, 506 and 508 respectively. Each 
MAC engine is coupled to other MAC engines, register file 500 and the two banks of 
data memories, 518 and 520 respectively. For this LMS adaptive filter example, data
memory 518 contains input data to the filter, and data memory 520 stores coefficient
information of the filter. This combination of four MAC engines and two separately
addressable data memories allows arithmetic unit 404 to perform, for instance, one
32-bit by 32-bit complex number or four 16-bit by 16-bit real number operations
simultaneously.
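
The equivalence between one complex operation and four real operations follows
from the identity (a + jb)(c + jd) = (ac - bd) + j(ad + bc), which requires four
real multiplications. The sketch below illustrates that identity; the assignment
of each partial product to a particular MAC engine is an assumption made for
illustration only.

```python
def complex_multiply(a_re: int, a_im: int, b_re: int, b_im: int) -> tuple:
    """Multiply two complex numbers using four real multiplications."""
    p0 = a_re * b_re    # could run on MAC engine 502 (assignment assumed)
    p1 = a_im * b_im    # could run on MAC engine 504 (assignment assumed)
    p2 = a_re * b_im    # could run on MAC engine 506 (assignment assumed)
    p3 = a_im * b_re    # could run on MAC engine 508 (assignment assumed)
    # Real part is ac - bd; imaginary part is ad + bc.
    return p0 - p1, p2 + p3
```

Because the four partial products are independent of one another, they can be
formed in the same cycle when four multipliers are available.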

Each MAC engine further includes four main functional blocks. Figure 5(b)
illustrates one embodiment of such a MAC engine. The four blocks are pre-adder
510, multiplier 512, accumulator 514 and data packing block 516. These blocks
operate in accordance with control information 408 from sequencer 400 as shown in
Figure 4. Pre-adder 510 essentially sums up data from register file 500, which
contains data from data memory 518. In one implementation, however, based on control
information 408, pre-adder 510 may further format the output of register file 500
and/or format its own summation output.

Multiplier 512 accepts data from both data memories 518 and 520 and
pre-adder 510 and is mainly responsible for performing the multiplication between the
filter's input data and the filter coefficients. In one embodiment, multiplier 512 has
the capability to multiply either the output of pre-adder 510 or the data from data
memory 518 with the filter coefficients from data memory 520. Furthermore, this
embodiment of multiplier 512 includes a programmable shifter at the output of the
multiplication, which allows arithmetic unit 404 to adjust the filter coefficients
efficiently. The programmability of this shifter refers to the shifter's ability to shift
right or left a varying number of bit positions according to control information 408.
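
A software analogue of a programmable shifter at a multiplier output is sketched
below. The signed-shift convention (positive for left, negative for right) and the
function names are assumptions for this sketch; the right shift is arithmetic, so
it behaves as a scaling by a power of two on integer (fixed-point) data.

```python
def programmable_shift(value: int, shift: int) -> int:
    """Shift left when shift > 0 and right (arithmetic) when shift < 0."""
    if shift >= 0:
        return value << shift
    return value >> -shift


def scaled_product(a: int, b: int, shift: int) -> int:
    """Multiply two integers and rescale the product with the shifter."""
    return programmable_shift(a * b, shift)
```

One plausible use, offered only as an example, is to scale the product of the
error and the input data by a power-of-two step size when updating LMS
coefficients, which avoids a second multiplication.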

Accumulator 514 accepts and sums up data from data memories 518 and 520,
other MAC engines and multiplier 512. Similar to the mentioned embodiments of
pre-adder 510 and multiplier 512, one embodiment of accumulator 514 has the
flexibility to sum a selected multiplication output and data from data memories 518
and 520 in accordance with control information 408. The embodiment also allows
accumulator 514 to format the data before and after the addition operation. After
accumulator 514 hands off data to data packing block 516, data packing block 516
organizes the data into a predefined format, such as 64-bit words.

Although the disclosed embodiment of arithmetic unit 404 enables CAE 204
to efficiently execute the LMS adaptive filter operations, the present invention
further couples CAE 204 to other CAEs, each of which also contains the disclosed
arithmetic unit 404, so that they operate in parallel. The coupling of the CAEs is
accomplished through data router 406 as shown in Figure 4.

Figure 6 illustrates a general block diagram of one embodiment of data router
406. In particular, the embodiment includes control logic 600, multiplexer 602,
inter-CAE bus interface 604, first-in-first-out (hereinafter FIFO) buffer 606, FIFO
buffer 608, and register 610. It should be noted that the following discussions on
data router 406 would make a number of references to elements illustrated in Figures
2 and 4.



Control logic 600 manages the data flow to CAE 204's sequencer 400 and
CAE memory 402, neighboring CAEs and inter-CAE bus 202. Specifically, one
embodiment of control logic 600 uses information such as, but not limited to,
destination device identifications 612, and status signals 614 and 616 indicative of
the availability of the destination devices, etc. to generate a number of control and
status signals. Destination device identifications 612 are derived from signals 618,
620, 622 and 624. Signal 618 represents data that CAE 204 receives via its east port
220. Signal 620 represents data from CAE 204's sequencer 400 and arithmetic unit
404. Signal 622 represents data that CAE 204 receives via its inter-CAE port 222
from inter-CAE bus 202. Lastly, signal 624 represents data that CAE 204 receives
via its west port 218.

On the other hand, status signal 614 comes from neighboring CAEs of CAE
204, which indicate the ability of the neighboring CAEs to accept data. Status signal
616 comes from inter-CAE bus interface 604, which indicates the availability of the
non-neighboring CAEs on inter-CAE bus 202 to accept data from CAE 204. One
embodiment of inter-CAE bus interface 604 submits requests to inter-CAE bus
controller 200 to access particular non-neighboring CAEs that are specified by
destination device identifications 612. Inter-CAE bus interface 604 then relays the
response from inter-CAE bus controller 200 to control logic 600 in the form of status
signal 616.



If status signals 614 and 616 indicate that the destination devices are available to
receive data, control logic 600 then issues certain control signals to drive data to
the appropriate destination devices. For instance, control logic 600 may assert
register enable signal 626 to drive data temporarily stored in register 610 to
neighboring CAEs. Alternatively, control logic 600 may assert multiplexer control
signal 628 to instruct multiplexer 602 to pass through certain information to
sequencer 400 and/or CAE memory 402. Certain data are placed in FIFO 606 and
FIFO 608 before they are driven to their final destinations. These FIFOs are
provided to smooth out any peak congestion conditions that data router 406 may
experience. After data router 406 places data in FIFOs 606 and 608, control logic
600 then asserts status signals 630 to indicate that data router 406 is available to
receive new data.
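
The gating of data on destination availability may be modeled, in greatly
simplified form, as shown below. The class name `DataRouterModel`, the single
FIFO, its depth and the destination labels are assumptions made for this sketch;
the disclosed data router uses two FIFOs, a register and a multiplexer whose exact
interplay is not reproduced here.

```python
from collections import deque


class DataRouterModel:
    """Toy model: buffer incoming data and forward it when the target is ready."""

    def __init__(self, fifo_depth: int = 4):
        self.fifo = deque()
        self.fifo_depth = fifo_depth

    def ready(self) -> bool:
        """Analogue of status signals 630: True while the buffer has room."""
        return len(self.fifo) < self.fifo_depth

    def accept(self, dest: str, data) -> bool:
        """Queue data for dest; return False when the buffer is full."""
        if not self.ready():
            return False
        self.fifo.append((dest, data))
        return True

    def drive(self, neighbor_ready: bool, bus_ready: bool):
        """Forward the oldest entry if its destination can accept data."""
        if not self.fifo:
            return None
        dest, _ = self.fifo[0]
        if dest in ("west", "east"):
            available = neighbor_ready    # analogue of status signal 614
        elif dest == "bus":
            available = bus_ready         # analogue of status signal 616
        else:
            available = True              # local sequencer or CAE memory
        return self.fifo.popleft() if available else None
```

In this model an entry whose destination is unavailable simply waits at the head
of the buffer, which illustrates how buffering can absorb short congestion peaks.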

Figure 3 illustrates a block diagram of general-purpose computer system 300
that includes one embodiment of re-targetable communication system 100.
Specifically, re-targetable communication system 100 resides on add-on card 334,
which couples to I/O bus 328. Together with add-on card 334, re-targetable
communication system 100 handles multiple types of communication data for
computer system 300. Some examples of the communication data are, but not
limited to, data that conform to standards for digital subscriber lines, cable modems,
integrated services digital network, T-1 lines, wireless communications, modems, etc.

The general-purpose computer system architecture comprises microprocessor
302 and cache memory 306 coupled to each other through processor bus 304.
Sample computer system 300 also includes high performance system bus 308 and
standard I/O bus 328. Coupled to high performance system bus 308 are
microprocessor 302 and system controller 310. Additionally, system controller 310
is coupled to memory subsystem 316 through channel 314, is coupled to I/O
controller hub 326 through link 324 and is coupled to graphics controller 320 through
interface 322. Coupled to graphics controller 320 is video display 318. Aside from
the mentioned add-on card 334, coupled to standard I/O bus 328 are I/O controller
hub 326, mass storage 330 and alphanumeric input device or other conventional
input device 332.

These elements perform their conventional functions well known in the art.
Moreover, it should have been apparent to one ordinarily skilled in the art that
computer system 300 could be designed with multiple microprocessors 302 and may
have more components than that which is shown. It should also have been apparent
to one with ordinary skill in the art to implement re-targetable communication system
100 in other systems than computer system 300 without exceeding the scope of the
present invention.

Thus, a re-targetable communication system has been described. Although
the present invention has been described particularly with reference to the figures and to
specific examples, it will be apparent to one of ordinary skill in the art that the
present invention may appear in any of a number of other communication system
architectures. It is contemplated that many changes and modifications may be made
by one of ordinary skill in the art without departing from the spirit and scope of the
present invention.



